Size Matters: Finding the Most Informative Set of Window Lengths
Authors
Abstract
Event sequences often contain continuous variability at different levels. In other words, their properties and characteristics change at different rates, concurrently. For example, the sales of a product may slowly become more frequent over a period of several weeks, but there may be interesting variation within a week at the same time. To provide an accurate and robust “view” of such multi-level structural behavior, one needs to determine the appropriate levels of granularity for analyzing the underlying sequence. We introduce the novel problem of finding the best set of window lengths for analyzing discrete event sequences. We define suitable criteria for choosing window lengths and propose an efficient method to solve the problem. We give examples of tasks that demonstrate the applicability of the problem and present extensive experiments on both synthetic data and real data from two domains: text and DNA. We find that the optimal sets of window lengths themselves can provide new insight into the data, e.g., the burstiness of events affects the optimal window lengths for measuring the event frequencies.
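To make the idea of viewing one sequence at several granularities concrete, the following minimal sketch computes sliding-window event frequencies of a binary event sequence for a few candidate window lengths. It is an illustration only and not the selection method proposed in the paper: the function names `windowed_frequencies` and `frequency_profiles`, the toy sequence, and the candidate lengths are all assumptions made for this example.

```python
from typing import Dict, List, Sequence


def windowed_frequencies(events: Sequence[int], window: int) -> List[float]:
    """Relative event frequency in every sliding window of length `window`.

    `events` is a 0/1 sequence (1 = the event occurred at that position).
    Hypothetical helper for illustration; not taken from the paper.
    """
    if window <= 0 or window > len(events):
        raise ValueError("window must be between 1 and len(events)")
    count = sum(events[:window])          # events in the first window
    freqs = [count / window]
    for i in range(window, len(events)):
        # Slide by one position: add the entering item, drop the leaving one.
        count += events[i] - events[i - window]
        freqs.append(count / window)
    return freqs


def frequency_profiles(events: Sequence[int],
                       windows: Sequence[int]) -> Dict[int, List[float]]:
    """Frequency profile of the same sequence for each candidate window length."""
    return {w: windowed_frequencies(events, w) for w in windows}


# A bursty toy sequence: two short bursts separated by a quiet stretch.
events = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0]
profiles = frequency_profiles(events, [3, 6])
for w, freqs in profiles.items():
    print(w, [round(f, 2) for f in freqs])
```

In this toy example the short window follows each burst closely, while the longer window smooths the bursts into a slower trend. Choosing which set of lengths is most informative is the problem the paper addresses; the sketch above only produces the candidate views.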
Similar resources
Mining Frequent Patterns in Uncertain and Relational Data Streams using the Landmark Windows
Today, many modern applications search for frequent and repeating patterns in the analyzed data sets. This search looks for patterns that appear frequently in the data set and marks them as frequent patterns, so that users can make decisions based on these discoveries. Most algorithms presented in the context of data stream mining and frequent pattern detection work either on uncertai...
Evaluation of effective factors in window optimization of fry analysis to identify mineralization pattern: Case study of Bavanat region, Iran
The known ore deposits and mineralization trends are important key exploration criteria in mineral exploration within a specific region. Fry analysis has conventionally been considered as a suitable method to determine the mineralization trends related to linear structures. Based upon literature sources, to date, no investigation has been carried out that includes the Sensitivity Analysis of Fe...
Examination of Legal Matters Covered in Elementary School Textbooks
In theory, legal topics loom large among those that need to be covered during the elementary school years. To determine the extent to which this coverage is actually present in elementary school textbooks, a set of 17 books out of 51 in total, deemed appropriate for covering such matters, was selected and content-analyzed using concepts related to ...
Topological Relationship Between Wiener Index in contrast to the Energy and Electric Moments in TUVC6[2p, q] with Same Circumference and Various Lengths
Topological indices are one of the oldest and most widely used descriptors in Quantitative Structure-Property Relationships (QSPR). Amongst the topological indices used as descriptors in QSPR, the Wiener index is by far the most popular index, as it has been shown that the Wiener index has a strong correlation with the chemical properties of the compound. In this study, the relationship between...
A Simple One-Dimensional Model for Investigation of Heat and Mass Transfer Effects on Removal Efficiency of Particulate Matters in a Venturi Scrubber
In the present study, a mathematical model is developed to examine the effects of heat and mass transfer on the removal efficiency of particulate matter in a venturi type ...